Analyzing Stylometric Approaches to Author Obfuscation
Authors
Abstract
Authorship attribution is an important and emerging security tool. However, just as criminals may wear gloves to hide their fingerprints, so too may criminal authors mask their writing styles to escape detection. Most authorship studies have focused on cooperative and/or unaware authors who do not take such precautions. This paper analyzes the methods implemented in the Java Graphical Authorship Attribution Program (JGAAP) against essays in the Brennan-Greenstadt obfuscation corpus that were written in deliberate attempts to mask style. The results demonstrate that many of the more robust and accurate methods implemented in JGAAP are effective in the presence of active deception.
Related resources
The Case for Being Average: A Mediocrity Approach to Style Masking and Author Obfuscation - (Best of the Labs Track at CLEF-2017)
Users posting online expect to remain anonymous unless they have logged in, which is often needed for them to be able to discuss various topics freely. Preserving the anonymity of a text’s writer can also be important in some other contexts, e.g., in the case of witness protection or anonymity programs. However, each person has his/her own style of writing, which can be analyzed using stylom...
An Author Profiling Approach Based on Language-dependent Content and Stylometric Features
We describe the approach that we submitted to the 2015 PAN competition [5] for the author profiling task. The task consists of predicting some attributes of an author by analyzing a set of his/her tweets. We consider several sets of stylometric and content features, and different decision algorithms: we use a different combination of features and decision algorithm for each language-attrib...
Author Obfuscation: Attacking the State of the Art in Authorship Verification
We report on the first large-scale evaluation of author obfuscation approaches built to attack authorship verification approaches: the impact of 3 obfuscators on the performance of a total of 44 authorship verification approaches has been measured and analyzed. The best-performing obfuscator successfully impacts the decision-making process of the authorship verifiers on average in about 47% of ...
Overview of the Author Obfuscation Task at PAN 2017: Safety Evaluation Revisited
We report on the second large-scale evaluation of style obfuscation approaches in a shared task on author obfuscation, organized at the PAN 2017 lab on digital text forensics. Author obfuscation means to automatically paraphrase a given text such that state-of-the-art authorship verification approaches misjudge a given pair of documents as having been written by “different authors” if in fact t...
Practical Attacks Against Authorship Recognition Techniques
The use of statistical AI techniques in authorship recognition (or stylometry) has contributed to literary and historical breakthroughs. These successes have led to the use of these techniques in criminal investigations and prosecutions. However, few have studied adversarial attacks, motivated by a desire to protect anonymity and privacy in a variety of scenarios, and their devastating effect o...